I need your help!

I want your feedback to make the book better for you and other readers. If you find typos, errors, or places where the text may be improved, please let me know. The best ways to provide feedback are by GitHub or hypothes.is annotations.

Opening an issue or submitting a pull request on GitHub: https://github.com/isaactpetersen/Fantasy-Football-Analytics-Textbook

Hypothesis Adding an annotation using hypothes.is. To add an annotation, select some text and then click the symbol on the pop-up menu. To see the annotations of others, click the symbol in the upper right-hand corner of the page.

11  Mixed Models

11.1 Getting Started

11.1.1 Load Packages

Code
library("petersenlab")
library("lme4")
library("lmerTest")
library("MuMIn")
library("emmeans")
library("sjstats")
library("mgcv")
library("AICcmodavg")
library("parallel")
library("plotly")
library("viridis")
library("tidyverse")

11.1.2 Specify Package Options

Code
emm_options(lmerTest.limit = 100000)
emm_options(pbkrtest.limit = 100000)

11.1.3 Load Data

Code
load(file = "./data/nfl_depthCharts.RData")
load(file = "./data/player_stats_weekly.RData")
load(file = "./data/player_stats_seasonal.RData")

We created the player_stats_weekly.RData and player_stats_seasonal.RData objects in Section 3.21.3.

11.2 Overview of Mixed Models

We will discuss a modeling framework that goes by many terms, including mixed models, mixed-effects models, multilevel models, hierarchical linear models.

11.2.1 Ecological Fallacy

11.2.2 Simpson’s Paradox

11.3 Fantasy Points Per Season by Position, Age, and Experience

Code
player_stats_seasonal_offense_subset <- player_stats_seasonal_offense %>% 
  filter(position_group %in% c("QB","RB","WR","TE"))

player_stats_seasonal_offense_subset$position[which(player_stats_seasonal_offense_subset$position == "HB")] <- "RB"

player_stats_seasonal_kicking_subset <- player_stats_seasonal_kicking %>% 
  filter(position == "K")

player_stats_seasonal_offense_subset <- bind_rows(
  player_stats_seasonal_offense_subset,
  player_stats_seasonal_kicking_subset
)

player_stats_seasonal_offense_subset$player_idFactor <- factor(player_stats_seasonal_offense_subset$player_id)
player_stats_seasonal_offense_subset$positionFactor <- factor(player_stats_seasonal_offense_subset$position)
Code
seasons17week <- 2001:2020
seasons18week <- 2021:max(nfl_depthCharts$season, na.rm = TRUE)

endOfSeasonDepthCharts <- nfl_depthCharts %>% 
  filter((season %in% seasons17week & week == 18) | (season %in% seasons18week & week == 19)) # get end-of-season depth charts

qb1s <- endOfSeasonDepthCharts %>% 
  filter(position == "QB", depth_team == 1)

fb1s <- endOfSeasonDepthCharts %>% 
  filter(position == "FB", depth_team == 1)

k1s <- endOfSeasonDepthCharts %>% 
  filter(position == "K", depth_team == 1)

rb1s <- endOfSeasonDepthCharts %>% 
  filter(position == "RB", depth_team == 1)

wr1s <- endOfSeasonDepthCharts %>% 
  filter(position == "WR", depth_team == 1)

te1s <- endOfSeasonDepthCharts %>% 
  filter(position == "TE", depth_team == 1)

player_stats_seasonal_offense_subsetDepth <- player_stats_seasonal_offense_subset %>% 
  filter(player_id %in% c(
    qb1s$gsis_id,
    fb1s$gsis_id,
    k1s$gsis_id,
    rb1s$gsis_id,
    wr1s$gsis_id,
    te1s$gsis_id
    ))

11.3.1 Scatterplots of Fantasy Points by Age and Position

Scatterplots are a helpful tool for quickly examining the association between two variables. However, scatterplots—as well as correlation and multiple regression—can hide meaningful associations that differ across units of analysis.

11.3.1.1 Quarterbacks

A scatterplot of Quarterbacks’ fantasy points by age is in Figure 11.1.

Code
plot_scatterplotFantasyPointsByAgeQB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "QB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_point(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  geom_smooth(
    mapping = aes(
    x = age,
    y = fantasy_points),
    inherit.aes = FALSE
  ) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Quarterbacks"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_scatterplotFantasyPointsByAgeQB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.1: Scatterplot of Fantasy Points by Age for Quarterbacks.

Based on the scatterplot (and the bivariate association below), Quarterbacks’ fantasy points appear to increase with age.

Code
cor.test(
  formula = ~ age + fantasy_points,
  data = player_stats_seasonal_offense_subset %>% filter(position == "QB")
)

    Pearson's product-moment correlation

data:  age and fantasy_points
t = 3.2358, df = 1051, p-value = 0.001251
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.03914186 0.15877861
sample estimates:
       cor 
0.09931915 

11.3.1.2 Fullbacks

A scatterplot of Fullbacks’ fantasy points by age is in Figure 11.2.

Code
plot_scatterplotFantasyPointsByAgeFB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "FB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_point(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  geom_smooth(
    mapping = aes(
    x = age,
    y = fantasy_points),
    inherit.aes = FALSE
  ) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Fullbacks"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_scatterplotFantasyPointsByAgeFB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.2: Scatterplot of Fantasy Points by Age for Fullbacks.

Based on the scatterplot, Fullbacks’ fantasy points appear to be relatively stable across ages.

11.3.1.3 Running Backs

A scatterplot of Running Backs’ fantasy points by age is in Figure 11.3.

Code
plot_scatterplotFantasyPointsByAgeRB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "RB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_point(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  geom_smooth(
    mapping = aes(
    x = age,
    y = fantasy_points),
    inherit.aes = FALSE
  ) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Running Backs"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_scatterplotFantasyPointsByAgeRB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.3: Scatterplot of Fantasy Points by Age for Running Backs.

Based on the scatterplot, Running Backs’ fantasy points appear to be relatively stable across ages.

11.3.1.4 Wide Receivers

A scatterplot of Wide Receivers’ fantasy points by age is in Figure 11.4.

Code
plot_scatterplotFantasyPointsByAgeWR <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "WR") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_point(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  geom_smooth(
    mapping = aes(
    x = age,
    y = fantasy_points),
    inherit.aes = FALSE
  ) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Wide Receivers"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_scatterplotFantasyPointsByAgeWR,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.4: Scatterplot of Fantasy Points by Age for Wide Receivers.

Based on the scatterplot, Wide Receivers’ fantasy points appear to be relatively stable across ages.

11.3.1.5 Tight Ends

A scatterplot of Tight Ends’ fantasy points by age is in Figure 11.5.

Code
plot_scatterplotFantasyPointsByAgeTE <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "TE") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_point(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  geom_smooth(
    mapping = aes(
    x = age,
    y = fantasy_points),
    inherit.aes = FALSE
  ) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Tight Ends"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_scatterplotFantasyPointsByAgeTE,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.5: Scatterplot of Fantasy Points by Age for Tight Ends.

Based on the scatterplot (and the bivariate association below), Tight Ends’ fantasy points appear to increase with age.

Code
cor.test(
  formula = ~ age + fantasy_points,
  data = player_stats_seasonal_offense_subset %>% filter(position == "TE")
)

    Pearson's product-moment correlation

data:  age and fantasy_points
t = 5.3884, df = 1341, p-value = 8.388e-08
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.09280906 0.19753022
sample estimates:
      cor 
0.1455774 

11.3.2 Plots of Raw Trajectories of Fantasy Points By Age and Player

Scatterplots can be helpful for quickly visualizing the association between two variables. However, as mentioned earlier, scatterplots can hide the association between variables at different units of analysis. For instance, consider that we are trying to predict how a player will perform based on their age. We are interested not only in what the association is between age and fantasy points between players (i.e., a between-person association). We are also interested in what the association is between age and fantasy points within a given player (and within each player; i.e., a within-individual association). Arguably, the within-individual association between age and fantasy points is more relevant to the prediction of performance than the association between age and fantasy points between players. Assuming that the between-player association between age and fantasy points is the same as the within-player association when it is not is an example of the ecological fallacy.

Below, we depict players’ raw trajectories of fantasy points as a function of age. These are known as spaghetti plots. By examining the trajectory for each player, we can get a better understanding of hor performance changes (within an individual) as a function of age.

11.3.2.1 Quarterbacks

A plot of Quarterbacks’ raw fantasy points data by age is in Figure 11.6.

Code
plot_rawFantasyPointsByAgeQB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "QB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_line(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Quarterbacks"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_rawFantasyPointsByAgeQB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.6: Plot of Raw Trajectories of Fantasy Points by Age for Quarterbacks.

11.3.2.2 Fullbacks

A plot of Fullbacks’ raw fantasy points data by age is in Figure 11.7.

Code
plot_rawFantasyPointsByAgeFB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "FB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_line(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Fullbacks"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_rawFantasyPointsByAgeFB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.7: Plot of Raw Trajectories of Fantasy Points by Age for Fullbacks.

11.3.2.3 Running Backs

A plot of Running Backs’ raw fantasy points data by age is in Figure 11.8.

Code
plot_rawFantasyPointsByAgeRB <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "RB") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_line(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Running Backs"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_rawFantasyPointsByAgeRB,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.8: Plot of Raw Trajectories of Fantasy Points by Age for Running Backs.

11.3.2.4 Wide Receivers

A plot of Wide Receivers’ raw fantasy points data by age is in Figure 11.9.

Code
plot_rawFantasyPointsByAgeWR <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "WR") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_line(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Wide Receivers"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_rawFantasyPointsByAgeWR,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.9: Plot of Raw Trajectories of Fantasy Points by Age for Wide Receivers.

11.3.2.5 Tight Ends

A plot of Tight Ends’ raw fantasy points data by age is in Figure 11.10.

Code
plot_rawFantasyPointsByAgeTE <- ggplot(
  data = player_stats_seasonal_offense_subset %>% 
    filter(position == "TE") %>% 
    mutate(
      age = round(age, 2),
      fantasy_points = round(fantasy_points, 2)
    ),
  mapping = aes(
    x = age,
    y = fantasy_points,
    color = player_id)) +
  geom_line(
    aes(
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
  )) +
  scale_color_viridis(discrete = TRUE) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Tight Ends"
  ) +
  theme_classic() +
  theme(legend.position = "none")

ggplotly(
  plot_rawFantasyPointsByAgeTE,
  tooltip = c("age","fantasy_points","text","label"))
Figure 11.10: Plot of Raw Trajectories of Fantasy Points by Age for Tight Ends.

11.3.3 Mixed Models

By accounting for which player each observation comes from using mixed models, we can examine the association between age and fantasy points in a more meaningful way, without violating the assumption in multiple regression that the observations are independent (i.e., that the residuals are uncorrelated).

11.3.3.1 Null Model

Code
pointsPerSeason_nullModel <- lmerTest::lmer(
  fantasy_points ~ 1 + (1 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_nullModel)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: fantasy_points ~ 1 + (1 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
148044.0 148066.6 -74019.0 148038.0    13529 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.5321 -0.4328 -0.1682  0.3538  5.5810 

Random effects:
 Groups          Name        Variance Std.Dev.
 player_idFactor (Intercept) 2197     46.87   
 Residual                    2335     48.32   
Number of obs: 13532, groups:  player_idFactor, 3358

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)   44.6942     0.9595 3834.9436   46.58   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
MuMIn::r.squaredGLMM(pointsPerSeason_nullModel)
     R2m       R2c
[1,]   0 0.4847453
Code
performance::icc(pointsPerSeason_nullModel)
Code
AIC(pointsPerSeason_nullModel)
[1] 148044

11.3.3.2 Model with Position as Fixed-Effect Predictor

Code
pointsPerSeason_position <- lmerTest::lmer(
  fantasy_points ~ positionFactor + (1 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_position)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: fantasy_points ~ positionFactor + (1 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
147766.3 147818.9 -73876.2 147752.3    13525 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.5274 -0.4477 -0.1408  0.3593  5.5407 

Random effects:
 Groups          Name        Variance Std.Dev.
 player_idFactor (Intercept) 1940     44.04   
 Residual                    2336     48.33   
Number of obs: 13532, groups:  player_idFactor, 3358

Fixed effects:
                 Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)        15.536      4.447 3449.114   3.494 0.000482 ***
positionFactorQB   61.285      5.178 3470.831  11.836  < 2e-16 ***
positionFactorRB   37.890      4.787 3499.181   7.915 3.28e-15 ***
positionFactorTE    9.873      4.903 3499.455   2.014 0.044136 *  
positionFactorWR   27.271      4.693 3490.629   5.811 6.75e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) pstFQB pstFRB pstFTE
postnFctrQB -0.859                     
postnFctrRB -0.929  0.798              
postnFctrTE -0.907  0.779  0.842       
postnFctrWR -0.948  0.814  0.880  0.859
Code
MuMIn::r.squaredGLMM(pointsPerSeason_position)
            R2m      R2c
[1,] 0.06139709 0.487171
Code
emmeans::emmeans(pointsPerSeason_position, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               15.5 4.45 2896     6.81     24.3
 QB               76.8 2.65 2970    71.62     82.0
 RB               53.4 1.77 3241    49.95     56.9
 TE               25.4 2.07 3159    21.35     29.5
 WR               42.8 1.50 3284    39.86     45.8

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_position)
Code
AIC(pointsPerSeason_position)
[1] 147766.3

11.3.3.3 Identify the Best-Fitting Functional Form of Age

11.3.3.3.1 Linear Models
Code
pointsPerSeason_positionAgeFixedLinearSlopes <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + (1 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_positionAgeFixedLinearSlopes)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + (1 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 71242.9  71297.1 -35613.5  71226.9     6437 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.3406 -0.4440 -0.1160  0.3999  4.5152 

Random effects:
 Groups          Name        Variance Std.Dev.
 player_idFactor (Intercept) 2555     50.55   
 Residual                    2682     51.79   
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                 Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)        22.102      8.801 1433.812   2.511 0.012139 *  
positionFactorQB   89.929      9.765 1335.477   9.209  < 2e-16 ***
positionFactorRB   47.609      9.254 1353.881   5.145 3.07e-07 ***
positionFactorTE   16.785      9.348 1352.423   1.796 0.072779 .  
positionFactorWR   34.404      9.045 1355.844   3.804 0.000149 ***
ageCentered20      -1.495      0.249 5966.317  -6.005 2.02e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) pstFQB pstFRB pstFTE pstFWR
postnFctrQB -0.869                            
postnFctrRB -0.926  0.829                     
postnFctrTE -0.913  0.821  0.867              
postnFctrWR -0.947  0.848  0.896  0.887       
ageCentrd20 -0.180 -0.020  0.033  0.012  0.027
Code
MuMIn::r.squaredGLMM(pointsPerSeason_positionAgeFixedLinearSlopes)
            R2m       R2c
[1,] 0.09922355 0.5386768
Code
emmeans::emmeans(pointsPerSeason_positionAgeFixedLinearSlopes, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               12.5 8.67 1226    -4.49     29.6
 QB              102.5 4.53 1173    93.58    111.3
 RB               60.1 3.28 1266    53.72     66.6
 TE               29.3 3.53 1253    22.39     36.2
 WR               46.9 2.63 1310    41.79     52.1

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeFixedLinearSlopes, "ageCentered20")
 ageCentered20 emmean   SE   df lower.CL upper.CL
           6.4   50.3 2.24 1225     45.9     54.7

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_positionAgeFixedLinearSlopes)
Code
AIC(pointsPerSeason_positionAgeFixedLinearSlopes)
[1] 71242.93
Code
pointsPerSeason_positionAgeRandomLinearSlopes <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + (1 + ageCentered20 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_positionAgeRandomLinearSlopes)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + (1 + ageCentered20 |  
    player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 71019.5  71087.2 -35499.8  70999.5     6435 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.4021 -0.4324 -0.1166  0.3901  4.7118 

Random effects:
 Groups          Name          Variance Std.Dev. Corr 
 player_idFactor (Intercept)   3483.32  59.020        
                 ageCentered20   28.04   5.296   -0.52
 Residual                      2430.63  49.301        
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                  Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)        24.9438     8.8723 1459.3714   2.811 0.004998 ** 
positionFactorQB   88.9525     9.7308 1297.2928   9.141  < 2e-16 ***
positionFactorRB   47.3942     9.2153 1322.7062   5.143 3.11e-07 ***
positionFactorTE   16.3712     9.3043 1314.3615   1.760 0.078720 .  
positionFactorWR   34.2219     9.0054 1318.7078   3.800 0.000151 ***
ageCentered20      -2.0167     0.3462  716.2648  -5.826 8.59e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) pstFQB pstFRB pstFTE pstFWR
postnFctrQB -0.860                            
postnFctrRB -0.918  0.828                     
postnFctrTE -0.904  0.820  0.867              
postnFctrWR -0.938  0.848  0.896  0.887       
ageCentrd20 -0.237 -0.001  0.039  0.017  0.033
Code
MuMIn::r.squaredGLMM(pointsPerSeason_positionAgeRandomLinearSlopes)
            R2m       R2c
[1,] 0.09797596 0.5835368
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearSlopes, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               12.0 8.64 1189    -4.92     29.0
 QB              101.0 4.53 1151    92.10    109.9
 RB               59.4 3.28 1288    52.99     65.9
 TE               28.4 3.52 1242    21.50     35.3
 WR               46.3 2.63 1321    41.10     51.4

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearSlopes, "ageCentered20")
 ageCentered20 emmean   SE   df lower.CL upper.CL
           6.4   49.4 2.25 1168       45     53.8

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_positionAgeRandomLinearSlopes)
Code
AIC(pointsPerSeason_positionAgeRandomLinearSlopes)
[1] 71019.5
11.3.3.3.2 Quadratic Models
Code
pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + (1 + ageCentered20 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +  
    (1 + ageCentered20 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 70665.5  70740.0 -35321.7  70643.5     6434 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.6477 -0.4430 -0.1063  0.3817  4.7823 

Random effects:
 Groups          Name          Variance Std.Dev. Corr 
 player_idFactor (Intercept)   4050.7   63.645        
                 ageCentered20   61.2    7.823   -0.60
 Residual                      2157.5   46.449        
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             -24.02735    9.34231 1728.76809  -2.572   0.0102 *  
positionFactorQB         88.81757    9.85929 1344.37099   9.009  < 2e-16 ***
positionFactorRB         51.31903    9.32501 1359.21221   5.503 4.45e-08 ***
positionFactorTE         16.83493    9.41871 1353.13565   1.787   0.0741 .  
positionFactorWR         36.99633    9.11732 1357.91920   4.058 5.24e-05 ***
ageCentered20            14.28911    0.89964 4453.56443  15.883  < 2e-16 ***
ageCentered20Quadratic   -1.24457    0.05996 3926.72121 -20.756  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) pstFQB pstFRB pstFTE pstFWR agCn20
postnFctrQB -0.828                                   
postnFctrRB -0.889  0.830                            
postnFctrTE -0.872  0.822  0.869                     
postnFctrWR -0.907  0.849  0.899  0.889              
ageCentrd20 -0.340 -0.004  0.032  0.011  0.027       
agCntrd20Qd  0.259  0.007 -0.017 -0.005 -0.014 -0.893
Code
MuMIn::r.squaredGLMM(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes)
           R2m       R2c
[1,] 0.1655593 0.6746525
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               3.38 8.77 1236    -13.8     20.6
 QB              92.19 4.62 1223     83.1    101.3
 RB              54.70 3.33 1334     48.2     61.2
 TE              20.21 3.58 1293     13.2     27.2
 WR              40.37 2.68 1386     35.1     45.6

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes, "ageCentered20")
 ageCentered20 emmean   SE   df lower.CL upper.CL
           6.4   42.2 2.33 1234     37.6     46.7

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes, "ageCentered20Quadratic")
 ageCentered20Quadratic emmean   SE   df lower.CL upper.CL
                   51.5   42.2 2.33 1234     37.6     46.7

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes)
Code
AIC(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes)
[1] 70665.47
Code
pointsPerSeason_positionAgeRandomLinearRandomQuadraticSlopes <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + (1 + ageCentered20 + ageCentered20Quadratic | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + (1 + ageCentered20 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +  
    positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +  
    (1 + ageCentered20 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 70621.8  70750.5 -35291.9  70583.8     6426 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.7812 -0.4499 -0.1064  0.3837  4.7504 

Random effects:
 Groups          Name          Variance Std.Dev. Corr 
 player_idFactor (Intercept)   3874.28  62.24         
                 ageCentered20   56.85   7.54    -0.57
 Residual                      2136.45  46.22         
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                                          Estimate Std. Error         df
(Intercept)                              -23.02774   26.67799 4427.31960
positionFactorQB                          65.80803   28.07816 4194.19857
positionFactorRB                          62.12817   27.68236 4329.35668
positionFactorTE                          20.05015   27.89862 4326.89022
positionFactorWR                          29.59854   27.35775 4370.07994
ageCentered20                             11.79540    7.43993 4959.04172
ageCentered20Quadratic                    -0.87148    0.50915 4523.63824
positionFactorQB:ageCentered20             7.19239    7.65334 4914.64310
positionFactorRB:ageCentered20             1.31528    7.76909 4927.20674
positionFactorTE:ageCentered20            -0.85010    7.73970 4938.17986
positionFactorWR:ageCentered20             5.36603    7.64413 4949.66508
positionFactorQB:ageCentered20Quadratic   -0.49079    0.51719 4542.27913
positionFactorRB:ageCentered20Quadratic   -0.61350    0.53800 4439.78564
positionFactorTE:ageCentered20Quadratic    0.06118    0.52899 4492.88066
positionFactorWR:ageCentered20Quadratic   -0.65597    0.52521 4496.71285
                                        t value Pr(>|t|)  
(Intercept)                              -0.863   0.3881  
positionFactorQB                          2.344   0.0191 *
positionFactorRB                          2.244   0.0249 *
positionFactorTE                          0.719   0.4724  
positionFactorWR                          1.082   0.2794  
ageCentered20                             1.585   0.1129  
ageCentered20Quadratic                   -1.712   0.0870 .
positionFactorQB:ageCentered20            0.940   0.3474  
positionFactorRB:ageCentered20            0.169   0.8656  
positionFactorTE:ageCentered20           -0.110   0.9125  
positionFactorWR:ageCentered20            0.702   0.4827  
positionFactorQB:ageCentered20Quadratic  -0.949   0.3427  
positionFactorRB:ageCentered20Quadratic  -1.140   0.2542  
positionFactorTE:ageCentered20Quadratic   0.116   0.9079  
positionFactorWR:ageCentered20Quadratic  -1.249   0.2117  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
MuMIn::r.squaredGLMM(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction)
           R2m       R2c
[1,] 0.1582759 0.6751097
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               7.62 9.43 1344    -10.9     26.1
 QB              94.20 4.73 1111     84.9    103.5
 RB              46.59 3.71 1375     39.3     53.9
 TE              25.37 3.78 1247     18.0     32.8
 WR              37.80 2.89 1333     32.1     43.5

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction, "ageCentered20")
 ageCentered20 emmean   SE   df lower.CL upper.CL
           6.4   42.3 2.43 1301     37.5     47.1

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes, "ageCentered20Quadratic")
 ageCentered20Quadratic emmean   SE   df lower.CL upper.CL
                   51.5   42.2 2.33 1234     37.6     46.7

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction)
Code
AIC(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction)
[1] 70621.85
Code
pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + years_of_experience + (1 + ageCentered20 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  REML = FALSE,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
  method [lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +  
    positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +  
    years_of_experience + (1 + ageCentered20 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
 70592.4  70727.8 -35276.2  70552.4     6425 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.7526 -0.4415 -0.1012  0.3834  4.7239 

Random effects:
 Groups          Name          Variance Std.Dev. Corr 
 player_idFactor (Intercept)   3856.96  62.104        
                 ageCentered20   55.69   7.463   -0.58
 Residual                      2140.80  46.269        
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                                          Estimate Std. Error         df
(Intercept)                              -14.41747   26.71450 4443.49841
positionFactorQB                          61.85599   28.07863 4206.42470
positionFactorRB                          59.26821   27.68023 4341.88102
positionFactorTE                          18.66483   27.89324 4337.42684
positionFactorWR                          27.05937   27.35504 4381.59101
ageCentered20                              6.92672    7.48505 5052.03481
ageCentered20Quadratic                    -0.89662    0.50892 4506.71231
years_of_experience                        5.98527    1.05344 2059.25177
positionFactorQB:ageCentered20             7.15606    7.64964 4911.53510
positionFactorRB:ageCentered20             1.41150    7.76561 4923.15762
positionFactorTE:ageCentered20            -0.95441    7.73630 4933.84264
positionFactorWR:ageCentered20             5.38514    7.64072 4945.56253
positionFactorQB:ageCentered20Quadratic   -0.47891    0.51694 4525.86739
positionFactorRB:ageCentered20Quadratic   -0.62677    0.53771 4423.94494
positionFactorTE:ageCentered20Quadratic    0.06362    0.52872 4475.97907
positionFactorWR:ageCentered20Quadratic   -0.65946    0.52493 4480.80773
                                        t value Pr(>|t|)    
(Intercept)                              -0.540   0.5894    
positionFactorQB                          2.203   0.0277 *  
positionFactorRB                          2.141   0.0323 *  
positionFactorTE                          0.669   0.5034    
positionFactorWR                          0.989   0.3226    
ageCentered20                             0.925   0.3548    
ageCentered20Quadratic                   -1.762   0.0782 .  
years_of_experience                       5.682 1.52e-08 ***
positionFactorQB:ageCentered20            0.935   0.3496    
positionFactorRB:ageCentered20            0.182   0.8558    
positionFactorTE:ageCentered20           -0.123   0.9018    
positionFactorWR:ageCentered20            0.705   0.4810    
positionFactorQB:ageCentered20Quadratic  -0.926   0.3543    
positionFactorRB:ageCentered20Quadratic  -1.166   0.2438    
positionFactorTE:ageCentered20Quadratic   0.120   0.9042    
positionFactorWR:ageCentered20Quadratic  -1.256   0.2091    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
MuMIn::r.squaredGLMM(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
           R2m       R2c
[1,] 0.1650522 0.6694687
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               11.3 9.30 1351    -6.97     29.5
 QB               94.3 4.65 1109    85.17    103.4
 RB               47.3 3.65 1374    40.17     54.5
 TE               27.1 3.73 1250    19.80     34.4
 WR               38.9 2.85 1331    33.28     44.5

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "ageCentered20")
 ageCentered20 emmean  SE   df lower.CL upper.CL
           6.4   43.8 2.4 1308     39.1     48.5

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "ageCentered20Quadratic")
 ageCentered20Quadratic emmean  SE   df lower.CL upper.CL
                   51.5   43.8 2.4 1308     39.1     48.5

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "years_of_experience")
 years_of_experience emmean  SE   df lower.CL upper.CL
                 4.6   43.8 2.4 1308     39.1     48.5

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
Code
AIC(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
[1] 70592.39
11.3.3.3.3 Compare Models
Code
anova(
  pointsPerSeason_nullModel,
  pointsPerSeason_position
)
Code
anova(
  pointsPerSeason_positionAgeFixedLinearSlopes,
  pointsPerSeason_positionAgeRandomLinearSlopes,
  pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes,
  pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction
)
Code
mixedModels <- list(
  "null" = pointsPerSeason_nullModel,
  "position" = pointsPerSeason_position,
  "fixedLinear" = pointsPerSeason_positionAgeFixedLinearSlopes,
  "randomLinear" = pointsPerSeason_positionAgeRandomLinearSlopes,
  "randomLinearFixedQuadratic" = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes,
  "randomLinearFixedQuadraticInteraction" = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction
)

aictab(mixedModels)
11.3.3.3.4 Generalized Additive Model
Code
num_cores <- detectCores()
num_cores
[1] 4
Code
pointsPerSeason_gam <- bam( # using bam() instead of gam() for faster estimation due to large size of data
  fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + years_of_experience + s(player_idFactor, ageCentered20, bs = "re"),
  data = player_stats_seasonal_offense_subset,
  nthreads = num_cores
)

pointsPerSeason_gamSummary <- summary(pointsPerSeason_gam)

pointsPerSeason_gamSummary

Family: gaussian 
Link function: identity 

Formula:
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + 
    years_of_experience + s(player_idFactor, ageCentered20, bs = "re")

Parametric coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)          -7.1108    10.2247  -0.695  0.48680    
positionFactorQB     87.7506    10.7149   8.190 3.23e-16 ***
positionFactorRB     28.7960    10.2812   2.801  0.00511 ** 
positionFactorTE     12.6026    10.2924   1.224  0.22083    
positionFactorWR     22.6299     9.9928   2.265  0.02357 *  
years_of_experience   4.2443     0.9271   4.578 4.80e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                      edf   Ref.df      F p-value    
s(ageCentered20):positionFactorFB   1.761    2.221  2.488  0.0735 .  
s(ageCentered20):positionFactorQB   5.143    6.310 34.597  <2e-16 ***
s(ageCentered20):positionFactorRB   4.892    5.905 35.700  <2e-16 ***
s(ageCentered20):positionFactorTE   4.142    5.146 14.293  <2e-16 ***
s(ageCentered20):positionFactorWR   5.288    6.300 39.727  <2e-16 ***
s(player_idFactor,ageCentered20)  880.138 1287.000  5.070  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.583   Deviance explained = 64.1%
fREML =  35673  Scale est. = 2784.9    n = 6445
Code
pointsPerSeason_gamSummary$r.sq
[1] 0.5828168
Code
MuMIn::r.squaredGLMM(pointsPerSeason_gam)
           R2m       R2c
[1,] 0.5574201 0.5574201
Code
AIC(pointsPerSeason_gam)
[1] 70254.43
11.3.3.3.5 Players Who Were (at Least Once) at the Top of the End-of-Season Depth Chart
Code
pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience <- lmerTest::lmer(
  fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + years_of_experience + (1 + ageCentered20 | player_idFactor),
  data = player_stats_seasonal_offense_subset,
  control = lmerControl(optimizer = "bobyqa")
)

summary(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: 
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +  
    positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +  
    years_of_experience + (1 + ageCentered20 | player_idFactor)
   Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")

REML criterion at convergence: 70526.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-6.7527 -0.4422 -0.1011  0.3827  4.7192 

Random effects:
 Groups          Name          Variance Std.Dev. Corr 
 player_idFactor (Intercept)   3892.71  62.392        
                 ageCentered20   56.65   7.527   -0.58
 Residual                      2142.60  46.288        
Number of obs: 6445, groups:  player_idFactor, 1292

Fixed effects:
                                          Estimate Std. Error         df
(Intercept)                              -14.50932   26.75909 4424.09802
positionFactorQB                          61.83156   28.12724 4186.26252
positionFactorRB                          59.23635   27.72713 4322.39098
positionFactorTE                          18.69491   27.94051 4317.87428
positionFactorWR                          27.03245   27.40111 4362.14111
ageCentered20                              6.96089    7.49658 5041.41649
ageCentered20Quadratic                    -0.89892    0.50977 4503.24842
years_of_experience                        5.97746    1.05605 2050.13256
positionFactorQB:ageCentered20             7.16792    7.66153 4900.34818
positionFactorRB:ageCentered20             1.43053    7.77761 4912.87816
positionFactorTE:ageCentered20            -0.96261    7.74822 4923.54745
positionFactorWR:ageCentered20             5.40171    7.65247 4935.17920
positionFactorQB:ageCentered20Quadratic   -0.48061    0.51780 4522.63658
positionFactorRB:ageCentered20Quadratic   -0.62953    0.53863 4421.51275
positionFactorTE:ageCentered20Quadratic    0.06391    0.52961 4473.57721
positionFactorWR:ageCentered20Quadratic   -0.66176    0.52582 4477.90547
                                        t value Pr(>|t|)    
(Intercept)                              -0.542   0.5877    
positionFactorQB                          2.198   0.0280 *  
positionFactorRB                          2.136   0.0327 *  
positionFactorTE                          0.669   0.5035    
positionFactorWR                          0.987   0.3239    
ageCentered20                             0.929   0.3532    
ageCentered20Quadratic                   -1.763   0.0779 .  
years_of_experience                       5.660 1.72e-08 ***
positionFactorQB:ageCentered20            0.936   0.3495    
positionFactorRB:ageCentered20            0.184   0.8541    
positionFactorTE:ageCentered20           -0.124   0.9011    
positionFactorWR:ageCentered20            0.706   0.4803    
positionFactorQB:ageCentered20Quadratic  -0.928   0.3534    
positionFactorRB:ageCentered20Quadratic  -1.169   0.2426    
positionFactorTE:ageCentered20Quadratic   0.121   0.9040    
positionFactorWR:ageCentered20Quadratic  -1.259   0.2083    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
MuMIn::r.squaredGLMM(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
           R2m       R2c
[1,] 0.1648185 0.6708484
Code
emmeans::emmeans(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "positionFactor")
 positionFactor emmean   SE   df lower.CL upper.CL
 FB               11.3 9.30 1333    -6.99     29.5
 QB               94.2 4.65 1094    85.11    103.4
 RB               47.3 3.65 1356    40.10     54.4
 TE               27.1 3.73 1234    19.77     34.4
 WR               38.8 2.85 1313    33.22     44.4

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "ageCentered20")
 ageCentered20 emmean  SE   df lower.CL upper.CL
           6.4   43.7 2.4 1290       39     48.4

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "ageCentered20Quadratic")
 ageCentered20Quadratic emmean  SE   df lower.CL upper.CL
                   51.5   43.7 2.4 1290       39     48.4

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
emmeans::emmeans(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience, "years_of_experience")
 years_of_experience emmean  SE   df lower.CL upper.CL
                 4.6   43.7 2.4 1290       39     48.4

Results are averaged over the levels of: positionFactor 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
Code
performance::icc(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
Code
AIC(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)
[1] 70566.51
Code
pointsPerSeasonDepth_gam <- bam( # using bam() instead of gam() for faster estimation due to large size of data
  fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + years_of_experience + s(player_idFactor, ageCentered20, bs = "re"),
  data = player_stats_seasonal_offense_subsetDepth,
  nthreads = num_cores
)

pointsPerSeasonDepth_gamSummary <- summary(pointsPerSeasonDepth_gam)

pointsPerSeasonDepth_gamSummary

Family: gaussian 
Link function: identity 

Formula:
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + 
    years_of_experience + s(player_idFactor, ageCentered20, bs = "re")

Parametric coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)            6.464     11.895   0.543 0.586866    
positionFactorQB     112.659     12.225   9.215  < 2e-16 ***
positionFactorRB      52.338     11.810   4.431 9.58e-06 ***
positionFactorTE      27.058     11.954   2.264 0.023648 *  
positionFactorWR      41.742     11.430   3.652 0.000263 ***
years_of_experience    1.072      1.181   0.908 0.363798    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                      edf  Ref.df      F  p-value    
s(ageCentered20):positionFactorFB   1.554   1.931  0.635    0.461    
s(ageCentered20):positionFactorQB   4.974   6.127 19.094  < 2e-16 ***
s(ageCentered20):positionFactorRB   4.149   5.127 18.527  < 2e-16 ***
s(ageCentered20):positionFactorTE   3.761   4.711  7.432 2.17e-06 ***
s(ageCentered20):positionFactorWR   4.783   5.802 22.841  < 2e-16 ***
s(player_idFactor,ageCentered20)  600.644 780.000  6.092  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.564   Deviance explained = 61.7%
fREML =  28625  Scale est. = 3193.5    n = 5121
Code
pointsPerSeasonDepth_gamSummary$r.sq
[1] 0.564187
Code
MuMIn::r.squaredGLMM(pointsPerSeasonDepth_gam)
           R2m       R2c
[1,] 0.5405308 0.5405308
Code
AIC(pointsPerSeasonDepth_gam)
[1] 56444.14

11.3.4 Plots of Model-Implied Fantasy Points by Position and Age

Code
# Create newdata object
pointsPerSeason_positionAge_newData <- expand.grid(
  positionFactor = factor(c("FB","QB","RB","TE","WR")), #,"K"
  age = seq(from = 20, to = 40, length.out = 10000)
)

pointsPerSeason_positionAge_newData$ageCentered20 <- pointsPerSeason_positionAge_newData$age - 20
pointsPerSeason_positionAge_newData$ageCentered20Quadratic <- pointsPerSeason_positionAge_newData$ageCentered20 ^ 2
pointsPerSeason_positionAge_newData$years_of_experience <- floor(pointsPerSeason_positionAge_newData$age - 22) # assuming that most players start at age 22 (i.e., rookie year) and thus have 1 year of experience at age 23
pointsPerSeason_positionAge_newData$years_of_experience[which(pointsPerSeason_positionAge_newData$years_of_experience < 0)] <- 0

# From Quadratic Model: All Players
pointsPerSeason_positionAge_newData$fantasyPoints_quadratic <- predict(
  object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
  newdata = pointsPerSeason_positionAge_newData,
  re.form = NA
)

# From Quadratic Model: Players at Top of End-of-Season Depth Chart
pointsPerSeason_positionAge_newData$fantasyPoints_depthQuadratic <- predict(
  object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
  newdata = pointsPerSeason_positionAge_newData,
  re.form = NA
)

# From GAM Model: All Players
pointsPerSeason_positionAge_newData$fantasyPoints_gam <- predict(
  object = pointsPerSeason_gam,
  newdata = pointsPerSeason_positionAge_newData,
  newdata.guaranteed = TRUE,
  exclude = "s(player_idFactor,ageCentered20)"
)

# From GAM Model: Players at Top of End-of-Season Depth Chart
pointsPerSeason_positionAge_newData$fantasyPoints_depthGAM <- predict(
  object = pointsPerSeasonDepth_gam,
  newdata = pointsPerSeason_positionAge_newData,
  newdata.guaranteed = TRUE,
  exclude = "s(player_idFactor,ageCentered20)"
)

Plots of model-implied fantasy points by position and age are in Figures 11.1111.14.

11.3.4.1 Quadratic Model

Code
ggplot2::ggplot(
  data = pointsPerSeason_positionAge_newData,
  mapping = aes(
    x = age,
    y = fantasyPoints_quadratic,
    color = positionFactor
  )
) + 
  geom_smooth() +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age and Position",
    subtitle = "Quadratic Model with All Players"
  ) +
  theme_classic()
Figure 11.11: Plot of Model-Implied Quadratic Trajectories of Fantasy Points by Age.

11.3.4.2 Quadratic Model: Top of Depth Chart

Code
ggplot2::ggplot(
  data = pointsPerSeason_positionAge_newData,
  mapping = aes(
    x = age,
    y = fantasyPoints_depthQuadratic,
    color = positionFactor
  )
) + 
  geom_smooth() +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age and Position",
    subtitle = "Quadratic Model with Players Who Were Once at Top of End-of-Season Depth Chart"
  ) +
  theme_classic()
Figure 11.12: Plot of Model-Implied Quadratic Trajectories of Fantasy Points by Age For Players Who Were Once at the Top of the End-of-Season Depth Chart.

11.3.4.3 Generalized Additive Model

Code
ggplot2::ggplot(
  data = pointsPerSeason_positionAge_newData,
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    color = positionFactor
  )
) + 
  geom_smooth() +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age and Position",
    subtitle = "Generalized Additive Model with All Players"
  ) +
  theme_classic()
Figure 11.13: Plot of Implied Trajectories of Fantasy Points by Age from a Generalized Additive Model.

11.3.4.4 Generalized Additive Model: Top of Depth Chart

Code
ggplot2::ggplot(
  data = pointsPerSeason_positionAge_newData,
  mapping = aes(
    x = age,
    y = fantasyPoints_depthGAM,
    color = positionFactor
  )
) + 
  geom_smooth() +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age and Position",
    subtitle = "Generalized Additive Model with Players Who Were Once at Top of End-of-Season Depth Chart"
  ) +
  theme_classic()
Figure 11.14: Plot of Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, For Players Who Were Once at the Top of the End-of-Season Depth Chart.

11.3.5 Plots of Individuals’ Model-Implied Fantasy Points by Age

Code
player_stats_seasonal_offense_subsetCC <- player_stats_seasonal_offense_subset %>%
  filter(!is.na(player_idFactor), !is.na(fantasy_points), !is.na(positionFactor), !is.na(ageCentered20), !is.na(ageCentered20Quadratic), !is.na(years_of_experience))

player_stats_seasonal_offense_subsetCC$fantasyPoints_quadratic <- predict(
  object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
  newdata = player_stats_seasonal_offense_subsetCC
)

player_stats_seasonal_offense_subsetCC$fantasyPoints_gam <- predict(
  object = pointsPerSeason_gam,
  newdata = player_stats_seasonal_offense_subsetCC
)
Code
zeroAge <- pointsPerSeason_positionAge_newData %>% 
  group_by(positionFactor) %>% 
  filter(fantasyPoints_gam < 0) %>% 
  slice(which.min(age))

peakAge <- pointsPerSeason_positionAge_newData %>% 
  group_by(positionFactor) %>% 
  slice(which.max(fantasyPoints_gam))

peakAge2 <- pointsPerSeason_positionAge_newData %>% 
  filter(age > 22) %>% 
  group_by(positionFactor) %>% 
  slice(which.max(fantasyPoints_gam))

qbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "QB")], 0)
fbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "FB")], 0)
rbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "RB")], 0)
wrPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "WR")], 0)
wrPeakAge2 <- round(peakAge2$age[which(peakAge$positionFactor == "WR")], 0)
tePeakAge <- round(peakAge$age[which(peakAge$positionFactor == "TE")], 0)

qbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "QB")], 0)
fbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "FB")], 0)
rbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "RB")], 0)
wrZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "WR")], 0)
teZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "TE")], 0)

11.3.5.1 Quarterbacks

A plot of Quarterbacks’ model-implied fantasy points by age is in Figure 11.15. The model-implied peak of Quarterbacks’ fantasy points is at age 20. The model-predicted value of zero fantasy points for Quarterbacks is at 36.

Code
plot_individualFantasyPointsByAgeQB <- ggplot(
  data = player_stats_seasonal_offense_subsetCC %>% filter(position == "QB"),
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    group = player_id)) +
  geom_smooth(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    se = FALSE,
    linewidth = 0.5,
    color = "black") +
  geom_smooth(
    mapping = aes(
      x = age,
      y = fantasyPoints_gam
    ),
    data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "QB"),
    inherit.aes = FALSE,
    se = TRUE,
    linewidth = 2
  ) +
  geom_point(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    size = 1,
    color = "transparent" # make points invisible but keep tooltips
  ) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Quarterbacks"
  ) +
  theme_classic()

ggplotly(
  plot_individualFantasyPointsByAgeQB,
  tooltip = c("age","fantasyPoints_gam","text","label")
)
Figure 11.15: Plot of Individuals’ Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, for Quarterbacks. Overlaid with the Model-Implied Trajectory for Quarterbacks.

11.3.5.2 Fullbacks

A plot of Fullbacks’ model-implied fantasy points by age is in Figure 11.16. The model-implied peak of Fullbacks’ fantasy points is at age 26. The model-predicted value of zero fantasy points for Fullbacks is at 32.

Code
plot_individualFantasyPointsByAgeFB <- ggplot(
  data = player_stats_seasonal_offense_subsetCC %>% filter(position == "FB"),
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    group = player_id)) +
  geom_smooth(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    se = FALSE,
    linewidth = 0.5,
    color = "black") +
  geom_smooth(
    mapping = aes(
      x = age,
      y = fantasyPoints_gam
    ),
    data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "FB"),
    inherit.aes = FALSE,
    se = TRUE,
    linewidth = 2
  ) +
  geom_point(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    size = 1,
    color = "transparent" # make points invisible but keep tooltips
  ) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Fullbacks"
  ) +
  theme_classic()

ggplotly(
  plot_individualFantasyPointsByAgeFB,
  tooltip = c("age","fantasyPoints_gam","text","label")
)
Figure 11.16: Plot of Individuals’ Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, for Fullbacks. Overlaid with the Model-Implied Trajectory for Fullbacks.

11.3.5.3 Running Backs

A plot of Running Backs’ model-implied fantasy points by age is in Figure 11.17. The model-implied peak of Running Backs’ fantasy points is at age 20. The model-predicted value of zero fantasy points for Running Backs is at 30.

Code
plot_individualFantasyPointsByAgeRB <- ggplot(
  data = player_stats_seasonal_offense_subsetCC %>% filter(position == "RB"),
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    group = player_id)) +
  geom_smooth(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    se = FALSE,
    linewidth = 0.5,
    color = "black") +
  geom_smooth(
    mapping = aes(
      x = age,
      y = fantasyPoints_gam
    ),
    data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "RB"),
    inherit.aes = FALSE,
    se = TRUE,
    linewidth = 2
  ) +
  geom_point(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    size = 1,
    color = "transparent" # make points invisible but keep tooltips
  ) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Running Backs"
  ) +
  theme_classic()

ggplotly(
  plot_individualFantasyPointsByAgeRB,
  tooltip = c("age","fantasyPoints_gam","text","label")
)
Figure 11.17: Plot of Individuals’ Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, for Running Backs. Overlaid with the Model-Implied Trajectory for Running Backs.

11.3.5.4 Wide Receivers

A plot of Wide Receivers’ model-implied fantasy points by age is in Figure 11.18. The model-implied peaks of Wide Receivers’ fantasy points are at ages 20 and 26. The model-predicted value of zero fantasy points for Wide Receivers is at 30.

Code
plot_individualFantasyPointsByAgeWR <- ggplot(
  data = player_stats_seasonal_offense_subsetCC %>% filter(position == "WR"),
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    group = player_id)) +
  geom_smooth(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    se = FALSE,
    linewidth = 0.5,
    color = "black") +
  geom_smooth(
    mapping = aes(
      x = age,
      y = fantasyPoints_gam
    ),
    data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "WR"),
    inherit.aes = FALSE,
    se = TRUE,
    linewidth = 2
  ) +
  geom_point(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    size = 1,
    color = "transparent" # make points invisible but keep tooltips
  ) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Wide Receivers"
  ) +
  theme_classic()

ggplotly(
  plot_individualFantasyPointsByAgeWR,
  tooltip = c("age","fantasyPoints_gam","text","label")
)
Figure 11.18: Plot of Individuals’ Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, for Wide Receivers. Overlaid with the Model-Implied Trajectory for Wide Receivers.

11.3.5.5 Tight Ends

A plot of Tight Ends’ model-implied fantasy points by age is in Figure 11.19. The model-implied peak of Tight Ends’ fantasy points is at age 26. The model-predicted value of zero fantasy points for Tight Ends is at 32.

Code
plot_individualFantasyPointsByAgeTE <- ggplot(
  data = player_stats_seasonal_offense_subsetCC %>% filter(position == "TE"),
  mapping = aes(
    x = age,
    y = fantasyPoints_gam,
    group = player_id)) +
  geom_smooth(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    se = FALSE,
    linewidth = 0.5,
    color = "black") +
  geom_smooth(
    mapping = aes(
      x = age,
      y = fantasyPoints_gam
    ),
    data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "TE"),
    inherit.aes = FALSE,
    se = TRUE,
    linewidth = 2
  ) +
  geom_point(
    aes(
      x = age,
      y = fantasyPoints_gam,
      text = player_display_name, # add player name for mouse over tooltip
      label = season # add season for mouse over tooltip
    ),
    size = 1,
    color = "transparent" # make points invisible but keep tooltips
  ) +
  labs(
    x = "Player Age (years)",
    y = "Fantasy Points (Season)",
    title = "Fantasy Points (Season) by Player Age: Tight Ends"
  ) +
  theme_classic()

ggplotly(
  plot_individualFantasyPointsByAgeTE,
  tooltip = c("age","fantasyPoints_gam","text","label")
)
Figure 11.19: Plot of Individuals’ Implied Trajectories of Fantasy Points by Age, from a Generalized Additive Model, for Tight Ends. Overlaid with the Model-Implied Trajectory for Wide Tight Ends.

11.4 Conclusion

Mixed models allow accounting for multiple levels or units of analysis and to include both fixed and random effects. Inclusion of random effects allows effects of a predictor to vary as a function of individuals in the grouping level. This allows for more accurately predicting phenomena. We applied mixed models with random intercepts and random slopes to allow our model estimates to account for the fact that different players have different starting points (intercepts) and different changes over time (slopes) in fantasy points. A quadratic, inverted-U-shaped form as a function of age fit better than a linear form as a function of age in predicting players’ fantasy points. A generalized additive model that allows further nonlinearity fit even better than the quadratic model.

Based on the bivariate scatterplots between age and fantasy points, we might conclude that players tend to stay stable or even increase in fantasy points with age. However, we would be wrong. When we account for the longitudinal data (i.e., multiple observations over time for the same player) using mixed models, we observe that fantasy points tend to decrease with age, with the timing and rate of decline differing for each position. In other words, the association between age and fantasy points differs at the person level versus the group level. This is an example of Simpson’s paradox.

The discrepancy between the positive or null association between age and fantasy points at the group level, and the negative association at the person level may be due, in part, to the selective attrition of players with age. The highest performing players will tend to play the longest, whereas the poorest performing players will retire or get dropped from the team at the youngest ages. Thus, the selective attrition may make it seem that there is no association between age and performance (or even a positive one!), when in fact, players’ performance tends to decrease with age after age 26 or so (with the timing differing from position to position), until the player eventually retires or is dropped from the team. Selective attrition is common in longitudinal studies and intervention studies. For instance, attrition may be more likely for individuals from lower socioeconomic status backgrounds because they may face more challenges in continuing in longitudinal studies such as fewer financial resources, greater life stressors, etc. In addition, people who experience side effects or lack of improvement may be more likely to drop out of intervention studies. Examining only those who completed treatment (an example of selection bias) would make the intervention look more effective than it actually was because the people who stay in the study are those who experience the greatest improvement. Thus, it is important to use approaches such as mixed models or other approaches that account for the multiple observations from the same person, that use all available information, and that do not exclude people who do not complete all portions of the study.

In sum, mixed models are valuable for examining associations between variables when there are multiple levels (i.e., clustering or nesting). It is important not to confuse the association at one level (e.g., group level) with the association at another level (e.g., person level), which is an example of the ecological fallacy.

11.5 Session Info

Code
sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] lubridate_1.9.3   forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4      
 [5] purrr_1.0.2       readr_2.1.5       tidyr_1.3.1       tibble_3.2.1     
 [9] tidyverse_2.0.0   viridis_0.6.5     viridisLite_0.4.2 plotly_4.10.4    
[13] ggplot2_3.5.1     AICcmodavg_2.3-3  mgcv_1.9-1        nlme_3.1-164     
[17] sjstats_0.19.0    emmeans_1.10.3    MuMIn_1.48.4      lmerTest_3.1-3   
[21] lme4_1.1-35.5     Matrix_1.7-0      petersenlab_1.0.3

loaded via a namespace (and not attached):
 [1] DBI_1.2.3           mnormt_2.1.1        gridExtra_2.3      
 [4] rlang_1.1.4         magrittr_2.0.3      compiler_4.4.1     
 [7] vctrs_0.6.5         reshape2_1.4.4      quadprog_1.5-8     
[10] pkgconfig_2.0.3     fastmap_1.2.0       backports_1.5.0    
[13] labeling_0.4.3      pbivnorm_0.6.0      utf8_1.2.4         
[16] rmarkdown_2.27      tzdb_0.4.0          nloptr_2.1.1       
[19] xfun_0.46           jsonlite_1.8.8      VGAM_1.1-11        
[22] psych_2.4.6.26      broom_1.0.6         lavaan_0.6-18      
[25] cluster_2.1.6       R6_2.5.1            stringi_1.8.4      
[28] RColorBrewer_1.1-3  boot_1.3-30         rpart_4.1.23       
[31] numDeriv_2016.8-1.1 estimability_1.5.1  Rcpp_1.0.13        
[34] knitr_1.48          base64enc_0.1-3     splines_4.4.1      
[37] nnet_7.3-19         timechange_0.3.0    tidyselect_1.2.1   
[40] rstudioapi_0.16.0   yaml_2.3.10         lattice_0.22-6     
[43] plyr_1.8.9          withr_3.0.0         coda_0.19-4.1      
[46] evaluate_0.24.0     foreign_0.8-86      survival_3.6-4     
[49] pillar_1.9.0        checkmate_2.3.1     stats4_4.4.1       
[52] insight_0.20.2      generics_0.1.3      mix_1.0-12         
[55] hms_1.1.3           munsell_0.5.1       scales_1.3.0       
[58] minqa_1.2.7         xtable_1.8-4        glue_1.7.0         
[61] unmarked_1.4.1      Hmisc_5.1-3         lazyeval_0.2.2     
[64] tools_4.4.1         data.table_1.15.4   mvtnorm_1.2-5      
[67] grid_4.4.1          mitools_2.4         crosstalk_1.2.1    
[70] datawizard_0.12.2   colorspace_2.1-1    performance_0.12.2 
[73] htmlTable_2.4.3     Formula_1.2-5       cli_3.6.3          
[76] fansi_1.0.6         gtable_0.3.5        digest_0.6.36      
[79] pbkrtest_0.5.3      farver_2.1.2        htmlwidgets_1.6.4  
[82] htmltools_0.5.8.1   lifecycle_1.0.4     httr_1.4.7         
[85] MASS_7.3-60.2      

Feedback

Please consider providing feedback about this textbook, so that I can make it as helpful as possible. You can provide feedback at the following link: https://forms.gle/LsnVKwqmS1VuxWD18

Email Notification

The online version of this book will remain open access. If you want to know when the print version of the book is for sale, enter your email below so I can let you know.